feat: add Cacharr — rescue Riven items stuck in Scraped/Failed/Indexed via Prowlarr + RealDebrid#165
Cacharr rescues items stuck in Riven's Scraped/Failed/Indexed states by searching Prowlarr for torrents and adding them to RealDebrid to trigger server-side caching, then resetting the item so Riven picks it up once cached. Nothing downloads locally: RD fetches from seeders to their servers.
Changes:
- cacharr/cacharr.py: main daemon (~2280 lines); season-aware torrent selection, search-miss cooldown with exponential back-off, tried-hash TTL, stale-RD-content detection, and a web UI on port 8484
- cacharr/sync_library.py: optional companion that marks Riven items as Completed when the file already exists in the Radarr/Sonarr library
- utils/cacharr_settings.py: auto-injects Prowlarr API key from config.xml into Cacharr's env config on first start
- utils/auto_update.py: post-start hook; runs patch_cacharr_config() and restarts Cacharr if the Prowlarr key was freshly injected
- utils/dependency_map.py: cacharr depends on riven_backend
- main.py: cacharr added to grouped_keys (after riven_frontend)
- utils/dumb_config.json: default cacharr config (disabled by default)
- utils/dumb_config_schema.json: schema for cacharr config block
- .env.example: CACHARR_* env var documentation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
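The commit message mentions a tried-hash TTL and a search-miss cooldown with exponential back-off without showing either. The following is a minimal sketch of how those two mechanisms could work, not the code in `cacharr.py`: the 7-day TTL comes from the PR description, while `BASE_COOLDOWN`, `MAX_COOLDOWN`, and both function names are hypothetical.

```python
import time

TRIED_HASH_TTL = 7 * 24 * 3600  # 7 days in seconds (TTL stated in the PR description)
BASE_COOLDOWN = 15 * 60         # hypothetical first cooldown: 15 minutes
MAX_COOLDOWN = 24 * 3600        # hypothetical cap: 1 day


def prune_tried_hashes(tried, now=None):
    """Return only the hashes whose rejection timestamp is younger than the TTL."""
    now = time.time() if now is None else now
    return {h: ts for h, ts in tried.items() if now - ts < TRIED_HASH_TTL}


def next_cooldown(miss_count):
    """Seconds to wait after the Nth consecutive search miss (doubling, capped)."""
    if miss_count < 1:
        return 0
    return min(BASE_COOLDOWN * 2 ** (miss_count - 1), MAX_COOLDOWN)
```

With these assumed constants, the first miss waits 15 minutes, the third waits an hour, and the wait never exceeds a day; a hash dropped by `prune_tried_hashes` becomes eligible for another attempt.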
📝 Walkthrough
Adds a new Cacharr service: environment/schema entries, startup integration and dependency mapping, Prowlarr credential injection with automatic restart, and a filesystem-to-database library sync script that marks MediaItems completed with symlink metadata.
Sequence Diagram
sequenceDiagram
participant FS as Radarr/Sonarr Folders
participant Sync as Cacharr Sync Script
participant DB as Riven PostgreSQL
participant Config as Cacharr Process/Config
FS->>Sync: Enumerate movie/show folders\nExtract {imdb-tt...}, seasons, files
Sync->>DB: Query MediaItem / Season / Episode by ids
DB-->>Sync: Return records
alt Item not already completed+symlinked
Sync->>Sync: Resolve real file & folder (follow symlink)
Sync->>DB: UPDATE MediaItem set last_state='Completed', symlinked=true,\nsymlinked_at=now(), symlink_path=..., file=..., folder=...
DB-->>Sync: Commit
end
note right of Config: On service start, patch_cacharr_config() may inject\nPROWLARR_KEY and trigger Config restart
Config->>Config: Restart when config patched
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
utils/dumb_config_schema.json (1)
3520-3548: ⚠️ Potential issue | 🔴 Critical
Don't require `cacharr` during config validation yet.
`CONFIG_MANAGER._load_and_validate_config()` validates the persisted config before any default merge/migration runs. Existing installs upgrading from a config file that predates this service will now fail startup because the top-level `cacharr` key is missing, even though the service is disabled by default.
Suggested fix
   "required": [
     "puid", "pgid", "tz", "data_root", "dumb", "traefik", "cli_debrid", "cli_battery",
     "decypharr", "nzbdav", "emby", "jellyfin", "phalanx_db", "plex", "tautulli", "neutarr",
     "seerr_sync", "profilarr", "seerr", "plex_debrid", "postgres", "pgadmin", "prowlarr",
     "radarr", "rclone", "riven_backend", "riven_frontend",
-    "cacharr",
     "sonarr", "whisparr", "zilean", "zurg"
   ]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@utils/dumb_config_schema.json` around lines 3520 - 3548, The JSON schema currently lists "cacharr" in the top-level "required" array causing CONFIG_MANAGER._load_and_validate_config() to reject older configs; remove "cacharr" from the required array in utils/dumb_config_schema.json (or make it optional) so validation no longer fails when that top-level key is absent, ensuring the existing merge/migration logic that adds defaults for the riven services can run normally.
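To illustrate why the ordering matters, here is a small sketch of "validate first, merge defaults second". It is a hand-rolled stand-in for the real schema validation, and the function name, the abbreviated key sets, and the default block are all hypothetical.

```python
def validate_then_merge(config, required, defaults):
    """Reject configs missing required keys, then fill in defaults for the rest."""
    missing = sorted(set(required) - set(config))
    if missing:
        # Validation runs before defaults are merged, so a key that only
        # exists in the defaults cannot satisfy a "required" entry.
        raise ValueError(f"missing required keys: {missing}")
    merged = dict(defaults)
    merged.update(config)  # values from the persisted config win over defaults
    return merged
```

An older config without a `cacharr` block passes when `cacharr` is absent from `required` and then receives the default block; with `cacharr` listed as required, the same config fails before the merge can run.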
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cacharr/sync_library.py`:
- Line 114: The current ep_re only captures a single episode (first E##) so
filenames like S01E01E02 or S01E01-02 leave subsequent episodes unresolved;
update the logic around ep_re to extract all episode numbers by using a regex
that finds all episode tokens (e.g., multiple E## occurrences and ranges like
E01-03), then expand ranges into individual episode numbers and iterate over
them when marking files/records (update the variable ep_re and the code paths
that consume it in the block using ep_re and the code between the original ep_re
usage and the handling in lines ~170-198); ensure you replace single-match logic
with re.findall and range-expansion so every episode from the filename is
processed.
- Around line 45-59: _find_video currently returns the first video from
os.listdir() which is nondeterministic and can pick samples/trailers; change it
to collect all files in folder_path whose extensions are in VIDEO_EXTS, filter
out filenames containing keywords like "sample" or "trailer" (case-insensitive),
and if multiple candidates remain choose the most likely main asset by selecting
the largest file (use os.path.getsize) with a deterministic tie-breaker (e.g.,
sort by name) before returning os.path.join(folder_path, chosen); still return
None if no valid candidate or if os.listdir raises OSError.
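The `_find_video` suggestion above could be sketched as follows. This is an illustration of the review comment, not the code in `sync_library.py`: the extension set is an assumption, the junk keywords are the two named in the comment, and `find_video` is a stand-in name.

```python
import os
import re

VIDEO_EXTS = {".mkv", ".mp4", ".avi", ".m4v"}           # assumed extension set
JUNK_RE = re.compile(r"sample|trailer", re.IGNORECASE)  # keywords from the review comment


def find_video(folder_path):
    """Return the most likely main video in folder_path, or None."""
    try:
        names = os.listdir(folder_path)
    except OSError:
        return None
    candidates = []
    for name in names:
        path = os.path.join(folder_path, name)
        ext = os.path.splitext(name)[1].lower()
        if ext in VIDEO_EXTS and os.path.isfile(path) and not JUNK_RE.search(name):
            candidates.append((name, path))
    if not candidates:
        return None
    # Largest file wins; equal sizes fall back to name order, so the result
    # does not depend on the order os.listdir() happens to return.
    best = min(candidates, key=lambda c: (-os.path.getsize(c[1]), c[0]))
    return best[1]
```

This version returns `None` when every video looks like junk, as the comment asks; the follow-up commit describes a different choice, falling back to the first video alphabetically in that case.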
In `@utils/auto_update.py`:
- Around line 2202-2221: The restart logic for "cacharr" uses the pre-patch env
snapshot and discards the second start result, so the restarted process still
gets the old PROWLARR_KEY and failures are ignored; after calling
patch_cacharr_config() (which mutates CONFIG_MANAGER) rebuild the env (or
re-read PROWLARR_KEY from CONFIG_MANAGER) before calling
process_handler.start_process, call stop_process first, then capture and return
the result/tuple from the second process_handler.start_process invocation
(instead of discarding it) so failures are propagated; keep the existing
shutting_down check around the flow and reference patch_cacharr_config,
CONFIG_MANAGER, PROWLARR_KEY, process_handler.stop_process and
process_handler.start_process when making the change.
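A rough sketch of the restart flow described for `auto_update.py` follows. The real signatures of `process_handler.start_process` and `stop_process` are not shown in this thread, so the `env=` keyword, the single-argument `stop_process`, and the function name are assumptions made for illustration.

```python
def restart_with_fresh_env(name, config_manager, process_handler, patch_config):
    """Patch the config, then restart `name` with an env rebuilt after the patch."""
    if not patch_config():
        return None  # nothing was injected, so no restart is needed
    # Re-read the config after patching; an env dict captured earlier is stale
    # and would not contain the freshly injected key.
    fresh_cfg = config_manager.get(name) or {}
    env = dict(fresh_cfg.get("env") or {})
    process_handler.stop_process(name)
    # Hand the start result back so a failed restart reaches the caller.
    return process_handler.start_process(name, env=env)
```

The two points that matter are the order (patch, re-read, stop, start) and returning the second start result instead of discarding it.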
In `@utils/cacharr_settings.py`:
- Around line 57-64: The code injects PROWLARR_KEY via _read_prowlarr_api_key()
but doesn't set PROWLARR_URL, so a non-standard local port is ignored; update
the logic that calls _read_prowlarr_api_key() to also discover or return the
local Prowlarr base URL (e.g., add or use a helper like _discover_prowlarr_url()
or have _read_prowlarr_api_key() return both key and url) and include
"PROWLARR_URL": discovered_url when calling CONFIG_MANAGER.update("cacharr",
{"env": {**env, "PROWLARR_KEY": api_key, "PROWLARR_URL": discovered_url}}) so
the injected key and correct URL are both persisted.
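One way to return both values, sketched under assumptions: the port is taken from the DUMB instance config and passed in by the caller, the key is read from the `ApiKey` element of Prowlarr's `config.xml`, and the host defaults to localhost. The function name and parameters are illustrative, and the stdlib parser is used only to keep the sketch dependency-free.

```python
import logging
import xml.etree.ElementTree as ET

logger = logging.getLogger("cacharr_settings_sketch")


def discover_prowlarr(config_file, port, host="127.0.0.1"):
    """Return (api_key, base_url) for a local Prowlarr, or ("", "") on failure."""
    try:
        root = ET.parse(config_file).getroot()
    except (ET.ParseError, OSError) as exc:
        logger.warning("could not read %s: %s", config_file, exc)
        return "", ""
    api_key = (root.findtext("ApiKey") or "").strip()
    if not api_key:
        return "", ""
    return api_key, f"http://{host}:{port}"
```

A caller would then persist both values together, so the key is never stored without the URL it belongs to.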
---
Outside diff comments:
In `@utils/dumb_config_schema.json`:
- Around line 3520-3548: The JSON schema currently lists "cacharr" in the
top-level "required" array causing CONFIG_MANAGER._load_and_validate_config() to
reject older configs; remove "cacharr" from the required array in
utils/dumb_config_schema.json (or make it optional) so validation no longer
fails when that top-level key is absent, ensuring the existing merge/migration
logic that adds defaults for the riven services can run normally.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 5e9bed70-fc16-4b53-9556-5919a2ad6902
📒 Files selected for processing (9)
- .env.example
- cacharr/cacharr.py
- cacharr/sync_library.py
- main.py
- utils/auto_update.py
- utils/cacharr_settings.py
- utils/dependency_map.py
- utils/dumb_config.json
- utils/dumb_config_schema.json
- dumb_config_schema.json: remove cacharr from top-level required list so existing configs without the key don't fail startup validation
- sync_library.py: _find_video now sorts entries and filters out sample/trailer/extras filenames before picking the main asset, falling back to the first video alphabetically if all files match the junk pattern; add _extract_ep_nums helper and update sync_episodes to iterate all episode numbers from multi-episode filenames (S01E01E02, S01E01-03 ranges) so every episode record gets marked Completed
- auto_update.py: after patch_cacharr_config() mutates CONFIG_MANAGER, reload the fresh env dict before restarting so the relaunched process receives the injected PROWLARR_KEY instead of the pre-patch snapshot; also capture the start_process return value so a failed restart is not silently swallowed
- cacharr_settings.py: rename _read_prowlarr_api_key to _discover_prowlarr which now returns (api_key, base_url); patch_cacharr_config injects PROWLARR_URL alongside PROWLARR_KEY so a non-default Prowlarr port is automatically picked up
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
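The multi-episode handling named in this commit can be sketched as below. This is not the `_extract_ep_nums` helper from the PR, whose body is not shown here; the regexes and the rule that a range end must not run into further letters or digits (so `-720p` is not read as a range) are assumptions.

```python
import re

# A season marker followed by one or more episode tokens:
# E01, E01E02, E01-03, E01-E03. The lookahead stops "-720p" or "-1080p"
# from being mistaken for the end of an episode range.
_TOKEN = r"[Ee]\d{1,3}(?:-[Ee]?\d{1,3}(?![0-9A-Za-z]))?"
_EP_BLOCK_RE = re.compile(r"[Ss]\d{1,2}((?:" + _TOKEN + r")+)")
_EP_TOKEN_RE = re.compile(r"[Ee](\d{1,3})(?:-[Ee]?(\d{1,3})(?![0-9A-Za-z]))?")


def extract_ep_nums(filename):
    """Return every episode number encoded in filename, in order, without repeats."""
    block = _EP_BLOCK_RE.search(filename)
    if not block:
        return []
    episodes = []
    for start, end in _EP_TOKEN_RE.findall(block.group(1)):
        first = int(start)
        last = int(end) if end else first
        if last < first:  # a reversed "range" is treated as a single episode
            last = first
        for num in range(first, last + 1):
            if num not in episodes:
                episodes.append(num)
    return episodes
```

A sync loop would then iterate over the returned list and mark one episode record per number, instead of stopping at the first `E##` match.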
Actionable comments posted: 1
🧹 Nitpick comments (1)
utils/cacharr_settings.py (1)
35-35: Consider hardened XML parsing for security: `defusedxml` is not currently a declared dependency.
Line 35 uses stdlib `xml.etree.ElementTree`, which is vulnerable to XML bomb attacks (Billion Laughs, Quadratic blowup). Narrower exception handling is also recommended at line 39. If `defusedxml` is added as a dependency, update:
- Import: `from defusedxml import ElementTree as ET`
- Exception handler: `except (ET.ParseError, OSError, ValueError) as exc:`
Note: The same pattern exists in multiple files (`prowlarr_settings.py`, `plex_settings.py`, `emby_settings.py`, etc.).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@utils/cacharr_settings.py` at line 35, Replace usage of xml.etree.ElementTree with defusedxml's safe parser by changing the import to "from defusedxml import ElementTree as ET" and update the error handling around ET.parse(config_file) (the ET.parse call) to catch the narrower tuple of exceptions "ET.ParseError, OSError, ValueError" (i.e., use except (ET.ParseError, OSError, ValueError) as exc). Apply the same change pattern to the other modules that call ET.parse such as prowlarr_settings.py, plex_settings.py, and emby_settings.py so all XML parsing is hardened.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@utils/cacharr_settings.py`:
- Around line 26-28: The code assumes CONFIG_MANAGER.get("prowlarr") and nested
values are dicts and directly calls instances.items() and env.get(...), which
can raise AttributeError if the config shape is wrong; update the prowlarr
handling around prowlarr_cfg and instances to validate types (e.g., check
isinstance(prowlarr_cfg, dict) and isinstance(instances, dict) and default to {}
when not) and before using env.get(...) validate env is a dict (or coerce via
dict(env) safely) so that calls to instances.items() and env.get(...) are
guarded against malformed config values.
---
Nitpick comments:
In `@utils/cacharr_settings.py`:
- Line 35: Replace usage of xml.etree.ElementTree with defusedxml's safe parser
by changing the import to "from defusedxml import ElementTree as ET" and update
the error handling around ET.parse(config_file) (the ET.parse call) to catch the
narrower tuple of exceptions "ET.ParseError, OSError, ValueError" (i.e., use
except (ET.ParseError, OSError, ValueError) as exc). Apply the same change
pattern to the other modules that call ET.parse such as prowlarr_settings.py,
plex_settings.py, and emby_settings.py so all XML parsing is hardened.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 91590c73-e309-4f57-9c6d-fd043df118be
📒 Files selected for processing (4)
- cacharr/sync_library.py
- utils/auto_update.py
- utils/cacharr_settings.py
- utils/dumb_config_schema.json
✅ Files skipped from review due to trivial changes (2)
- utils/auto_update.py
- cacharr/sync_library.py
🚧 Files skipped from review as they are similar to previous changes (1)
- utils/dumb_config_schema.json
- Add isinstance(instances, dict) guard in _discover_prowlarr() to handle unexpected config shapes gracefully
- Add isinstance(env, dict) guard in patch_cacharr_config() with warning log
- Rename inst_key to _inst_key (ruff B007: unused loop variable)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Actionable comments posted: 2
♻️ Duplicate comments (1)
utils/cacharr_settings.py (1)
26-27: ⚠️ Potential issue | 🟠 Major
Guard `prowlarr_cfg` type before nested dict access
`prowlarr_cfg.get("instances")` can still throw if `CONFIG_MANAGER.get("prowlarr")` is not a dict. Line 26 should validate shape first.
Suggested fix
-    prowlarr_cfg = CONFIG_MANAGER.get("prowlarr") or {}
-    instances = prowlarr_cfg.get("instances") or {}
+    prowlarr_cfg = CONFIG_MANAGER.get("prowlarr") or {}
+    if not isinstance(prowlarr_cfg, dict):
+        logger.warning("Cacharr: invalid prowlarr config type: %s", type(prowlarr_cfg).__name__)
+        return "", ""
+    instances = prowlarr_cfg.get("instances") or {}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@utils/cacharr_settings.py` around lines 26 - 27, Validate that the value returned by CONFIG_MANAGER.get("prowlarr") is a dict before calling .get on it: replace the direct nested access using prowlarr_cfg.get("instances") with a guarded check (e.g., if not isinstance(prowlarr_cfg, dict): prowlarr_cfg = {}) then assign instances = prowlarr_cfg.get("instances", {}) so that prowlarr_cfg and the variable names prowlarr_cfg and instances are protected from non-dict types and you avoid AttributeError during nested dict access.
🧹 Nitpick comments (1)
utils/cacharr_settings.py (1)
42-42: Narrow broad exception handlers to expected failure types
Lines 42 and 94 catch `Exception`, which masks unrelated failures. Catch expected exception types (`ET.ParseError`, `OSError`, and concrete config-write exceptions) and log context.
Also applies to: 94-94
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@utils/cacharr_settings.py` at line 42, Replace the two broad "except Exception as exc:" handlers in cacharr_settings.py (the one around the XML parsing logic and the one around the config-write block) with narrow exception clauses that only catch expected failures — for XML parsing use ET.ParseError, for filesystem or IO issues use OSError (or more specific IOError/FileNotFoundError as appropriate), and for config-write operations catch the concrete exception(s) raised by your config writer; update the corresponding processLogger.error calls to handle the specific exception variable so context is preserved. Ensure you do not swallow other exceptions by letting unexpected exceptions propagate.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@utils/cacharr_settings.py`:
- Line 89: CONFIG_MANAGER.update(...) is not defined; use the provided
ConfigManager.set method instead. Replace the call to
CONFIG_MANAGER.update("cacharr", {"env": {**env, **updates}}) with a call to
CONFIG_MANAGER.set using the same key and merged env payload (i.e., invoke
ConfigManager.set("cacharr", {"env": {**env, **updates}}) so the settings are
persisted correctly). Verify the symbol CONFIG_MANAGER and method set(...) are
used consistently in cacharr_settings.py.
- Line 38: Replace the insecure xml.etree.ElementTree usage with defusedxml's
hardened parser: change the module import (where ET is currently from
xml.etree.ElementTree) to from defusedxml import ElementTree as ET and keep the
existing ET.parse(config_file) call in utils/cacharr_settings.py; ensure any
other usages of ET in that module remain compatible, and add defusedxml to
project dependencies (requirements/pyproject) so the secure parser is installed.
---
Duplicate comments:
In `@utils/cacharr_settings.py`:
- Around line 26-27: Validate that the value returned by
CONFIG_MANAGER.get("prowlarr") is a dict before calling .get on it: replace the
direct nested access using prowlarr_cfg.get("instances") with a guarded check
(e.g., if not isinstance(prowlarr_cfg, dict): prowlarr_cfg = {}) then assign
instances = prowlarr_cfg.get("instances", {}) so that prowlarr_cfg and the
variable names prowlarr_cfg and instances are protected from non-dict types and
you avoid AttributeError during nested dict access.
---
Nitpick comments:
In `@utils/cacharr_settings.py`:
- Line 42: Replace the two broad "except Exception as exc:" handlers in
cacharr_settings.py (the one around the XML parsing logic and the one around the
config-write block) with narrow exception clauses that only catch expected
failures — for XML parsing use ET.ParseError, for filesystem or IO issues use
OSError (or more specific IOError/FileNotFoundError as appropriate), and for
config-write operations catch the concrete exception(s) raised by your config
writer; update the corresponding processLogger.error calls to handle the
specific exception variable so context is preserved. Ensure you do not swallow
other exceptions by letting unexpected exceptions propagate.
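The shape guards requested across these comments can be collected into one small helper. The sketch below is illustrative only; `as_dict`, `prowlarr_instances`, and the logger name are hypothetical, and a plain dict stands in for `CONFIG_MANAGER`.

```python
import logging

logger = logging.getLogger("cacharr_settings_sketch")


def as_dict(value, label):
    """Return value if it is a dict; otherwise warn (unless None) and return {}."""
    if isinstance(value, dict):
        return value
    if value is not None:
        logger.warning("unexpected type for %s: %s", label, type(value).__name__)
    return {}


def prowlarr_instances(config):
    """Read config["prowlarr"]["instances"] without assuming any level is a dict."""
    prowlarr_cfg = as_dict(config.get("prowlarr"), "prowlarr")
    instances = as_dict(prowlarr_cfg.get("instances"), "prowlarr.instances")
    # Skip individual instances that are not dicts instead of failing later.
    return {key: inst for key, inst in instances.items() if isinstance(inst, dict)}
```

Because every level is checked before `.get` or `.items()` is called, a malformed config degrades to an empty result with a warning instead of raising `AttributeError`.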
Summary
Adds Cacharr as a first-class DUMB service: a daemon that rescues items stuck in Riven's `Scraped`, `Failed`, or `Indexed` states by independently searching Prowlarr for torrents and submitting them to RealDebrid to trigger server-side caching. Once RD finishes, Riven picks the item up automatically. Nothing downloads locally.
- `tried_hashes` with 7-day TTL: hashes rejected by RD are automatically retried after a week
- Stale-RD-content detection: checks whether `Completed` items' RD torrents still exist; resets and re-queues them if RD purged them
- `sync_library.py` companion: scans the Radarr/Sonarr library for files already placed by Decypharr and marks matching Riven `MediaItem` records as `Completed`, preventing redundant re-scraping
- Dependencies: `requests` and `psycopg2`, both already present in DUMB's `/venv`
Files changed
- `cacharr/cacharr.py`: main daemon
- `cacharr/sync_library.py`: optional companion that marks Riven items as Completed when the file already exists in the Radarr/Sonarr library
- `utils/cacharr_settings.py`: auto-injects the Prowlarr API key and base URL from `config.xml` / instance port on first start
- `utils/auto_update.py`: post-start hook; runs `patch_cacharr_config()`, reloads fresh env before restart
- `utils/dependency_map.py`: `cacharr` depends on `riven_backend`
- `main.py`: `cacharr` added to `grouped_keys` (after `riven_frontend`)
- `utils/dumb_config.json`: default cacharr config (disabled by default)
- `utils/dumb_config_schema.json`: schema for cacharr config block (not in `required`)
- `.env.example`: `CACHARR_*` env var documentation
Defaults
Cacharr is disabled by default. Enable it in the DUMB UI or via:
The Prowlarr API key and base URL are auto-injected from Prowlarr's `config.xml` and instance port on first start. The RealDebrid API key is read automatically from Riven's existing `settings.json`. Everything else runs on localhost defaults, so no extra configuration is needed for a standard DUMB setup.
Changes since initial commit (CodeRabbit review)
- `cacharr` in schema `required[]` breaks existing installs: removed from `required`; config validation no longer rejects older configs missing the key
- `_find_video` nondeterministic (`os.listdir` order): sorts entries and filters out `sample`/`trailer`/`extras`/`featurette` filenames before picking the main asset
- Multi-episode filenames (`S01E01E02`, `S01E01-03`) only updated first episode: `_extract_ep_nums()` expands multi-episode and range tokens; loop iterates all episode numbers per file
- Restart used the pre-patch `env` snapshot (injected key missing): reloads `CONFIG_MANAGER.get("cacharr")` after patch and merges into env before restarting; `start_process` return value now captured
- Only `PROWLARR_KEY` injected, not `PROWLARR_URL`: `_discover_prowlarr()` returns `(api_key, base_url)`; both are injected when not already set so non-default ports work
Test plan
- `docker compose build && docker compose up -d`: container starts without errors
- Set `CACHARR_ENABLED=true`, confirm Cacharr process appears in DUMB UI
- […] `/log/cacharr.log`
- Existing config (no `cacharr` key): confirm DUMB starts normally
- `CACHARR_ENABLED=false` (default): confirm DUMB starts normally without Cacharr
Related PRs
- `CACHARR` service key enum and log parser so Cacharr appears in the DUMB UI with properly formatted logs
Context
This grew out of Discussion #164 where I shared Cacharr as a community tool after running it for ~2 months in my own DUMB setup (~370 items resolved, ~53% success rate). PUID-0 expressed interest in first-class integration, hence this PR.
🤖 Generated with Claude Code